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Abstract 

It has recently been proved that the popular nonlocal means (NLM) denoising 
algorithm does not optimally denoise images with sharp edges. Its weakness 
lies in the isotropic nature of the neighborhoods it uses to set its smoothing 
weights. In response, in this paper we introduce several theoretical and 
practical anisotropic nonlocal means (ANLM) algorithms and prove that they 
are near minimax optimal for edge-dominated images from the Horizon class. 
On real- world test images, an ANLM algorithm that adapts to the underlying 
image gradients outperforms NLM by a significant margin. 
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1. Introduction 

Image denoising is a fundamental primitive in image processing and com- 
puter vision. Denoising algorithms have evolved from the classical linear and 
median filters to more modern schemes like total variation denoising [51j, 
wavelet thresholding ^28j , and bilateral filters [211 ESI EH [66j . 
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Figure 1: An example of a Horizon class image that features a smooth edge 
contour that separates the white region from the black region. 

A particularly successful denoising scheme is the nonlocal means (NLM) 
algorithm [8j, which estimates each pixel value as a weighted average of 
other, similar noisy pixels. However, instead of using spatial adjacency or 
noisy pixel value as the similarity measure to adjust the estimate weights, 
NLM uses a more reliable notion of similarity based on the resemblance 
of the pixels' neighborhoods in high-dimensional space. This unique feature 
benefits NLM in two ways. First, it provides more accurate weight estimates. 
Second, it enables NLM to exploit the contribution of all pixels in the image. 
In concert, these features enable NLM to provide remarkable performance 
for a large class of image denoising problems. 

Nevertheless, in a recent paper, we have proved that NLM does not attain 
optimal performance on images with sharp edges from the so-called Horizon 
class (see Figure [T]) [41jj^ Indeed, NLM's theoretical performance is more or 
less equivalent to wavelet thresholding, which was shown to be suboptimal 
in [TTj . Empirical results have also confirmed the suboptimality of NLM in 
estimating sharp edges [HI [15[ HSl |^ . The core problem is that NLM (and 
wavelet thresholding) cannot exploit the smoothness of the edge contour that 
separates the white and black regions. 

In this paper, we introduce and study a new denoising framework and 



^Recently [5 has extended our results to more complicated image/edge models. 
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Figure 2: Comparison of (left) the isotropic neighborhoods employed by Non 
Local Means (NLM) versus (right) anisotropic neighborhoods employed by 
Anisotropic NLM (ANLM). 



prove that it is near-optimal for Horizon class images with sharp edges. 
Anisotropic nonlocal means (ANLM) outperforms NLM, wavelet threshold- 
ing, and more classical techniques by using anisotropic neighborhoods that 
are elongated along and matched to the local edge orientation. Figure [2] com- 
pares the neighborhoods used in ANLM with those used in NLM. Anisotropic 
neighborhoods enable ANLM to distinguish between similar and dissimilar 
pixels more accurately. 

We develop three different ANLM algorithms of increasing levels of practi- 
cality. Oracle Anisotropic Nonlocal Means (OANLM) assumes perfect knowl- 
edge of the local orientation of the edge contour and is used primarily for our 
theoretical optimality analysis. Discrete- angle Anisotropic Nonlocal Means 
(DANLM) optimizes the choice of the anisotropic neighborhood around each 
pixel in order to achieve near-optimal performance without any oracle in- 
formation. Since it is more computationally demanding than NLM, we in- 
troduce an algorithmic simplification. Gradient based Anisotropic Nonlo- 
cal Means (GANLM) uses image gradient information to estimate the edge 
orientation; using simulations, we demonstrate that GANLM significantly 
outperforms NLM in practice on both Horizon class and real-world images. 

Anisotropy is a fundamental feature of real-world images. Hence, many 
denoising algorithms have been proposed to exploit this feature O [HJ [T4l - 
[iniESlSailSlSZlEaEa wlU review aU these algorithms 
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and their similarities and differences with ANLM in Section [H 

The paper is organized as foUows. Section [2] explains the minimax frame- 
work we use to analyze the denoising algorithms and reviews the necessary 
background. Section [3] introduces the OANLM and DANLM algorithms and 
presents the main theorems. Section [4] addresses some of the practical ANLM 
issues by introducing GANLM and summarizes the results of a range of simu- 
lations using synthetic and real- world imagery. Section [5] reviews the related 
work in the literature. Section [6] discusses our current and potential future 
results. Section [7| contains the proofs of the main theorems. 

2. Minimax analysis framework 

In this section we introduce the minimax framework [28l |37j and the 
Horizon class image model considered in this paper. Note that, in order 
to streamline the proofs, we take a continuous-variable analysis approach in 
this paper, in contrast to our approach in 01]. The moral of the story is 
compatible with [41j, however. 

2.1. Risk 

We are interested in estimating an image described by the function / : 
[0, 1]^ ^ [0, 1] (/ G I/^([0, 1]^)) from its noisy observation 

dY{ti,t2) = f{tiM)dtidt2 + adW{ti,h)- (1) 

Without loss of generality, we consider only square images. Here W{ti^t2) 
is the Wiener sheet and a is a constant that scales the noise. For a given 
function / and a given estimator /, define the risk as 

W,/) = E(||/-/||i), 

where the expected value is over W . The risk can be decomposed into bias 
squared and variance terms 

i?(/,/) = ll/-E/||^ + E||/-E/||2. 



^The Wiener sheet is the primitive of white noise. 
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Let / belong to a class of functions T . The risk of the estimator / on 
is defined as the risk of the least-favorable function, i.e., 

i?(^,/) = supi?(/,/). 

The minimax risk over the class of functions J-" is then defined as 

i?*(J') = infsupi?(/,/). 

R*{J-') is a lower bound for the performance of any estimator on J^. 

In this paper, we are interested in the asymptotic setting as a ^ 0. For 
all the estimators we consider, /) ^ as a ^ 0. Therefore, following 

[IDl ^ we will consider the decay rate of the minimax risk as our measure of 
performance. For this purpose, we will use the following asymptotic notation. 

Definition 1. /(a) = 0{g{a)) as a ^ 0^ if and only if there exist ctq and 
c such that for any a < ctq; \f{cr)\ < c\g{a)\. Likewise^ f{a) = Q{g{a)) 
as a ^ 0, if and only if there exist ctq and c such that for any a < ao, 
\f{(j)\ > c\g{a)\. Finally, f{a) = Q{g{(j)), if and only if f{(j) = 0{g{a)) 
and f{a) = Q{g{a)). We may interchangeably use f{a) x g[a) for f{a) = 

e(^(cT)). 

Definition 2. /(a) = o{g{a)) if and only z/limcr^o = 0. 

2.2. Horizon edge model 

In our analysis, we consider the Horizon model that contains piecewise 
constant images with sharp step edges lying along a smooth contour [TTl [Ml 
[48j (our analysis extends easily to piecewise smooth edges). Let Holder^ {C) 
be the class of Holder functions on R, defined in the following way: h G 
Holder^ {C) if and only if 

\h^^\ti) - h^^\t\)\ <C|^l-^;r-^ 

where k = [aj . Consider a transformation that maps each one-dimensional 
edge contour function /i to a two-dimensional image //^ : [0, 1]^ ^ [0, 1] via 

fhih^h) = l{^2</^(^l)}• 
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The Horizon class of images is then defined as 

H'^iC) = {fh{tiM) ' h G Hdlder''{C)nHdlder\l)}, (2) 

where a is the smoothness of the edge contour. Figure [T] iUustrates a sample 
function of this class. The following theorem characterizes the minimax rate 
ofH^{C) [IZlEilllSig 

Theorem 1. [I7l[34l[48] For a > 1, the minimax risk of the class H^{C) is 

Achieving the minimax rate is a laudable goal that any well-respecting 
denoising algorithm should aspire to. In this paper, we will focus primarily 
on a = 2 edge contours, for which the optimal minimax decay rate is a^/^. 
However, it is straightforward to draw similar conclusions for the other values 
of a. 



2.3. Image denoising algorithms 

Minimax risk analysis of the classical denoising algorithms has revealed 
their suboptimal performance on images with sharp edges. Table [T] summa- 
rizes several of these algorithms and their minimax decay rates (up to a log 
factor)!^ The suboptimality of wavelet thresholding has led to the develop- 
ment of ridgelets [Hj, curvelets wedgelets platelets [63], shearlets 
[35j, contourlets pj], bandelets [46], directionlets [6l] and other types of 
directional transforms. See [271 Il2] and the references therein for more in- 
formation. In the framework of this paper, wedgelet denoising [IT] provably 
achieves the optimal minimax rate. However, it performs poorly on textures, 
which has limited its application in image processing. 



^The models considered in [T71[34l[48] are slightly different from the continuous frame- 
work of this paper. Therefore, for the sake of completeness we prove Theorem [l] in |Ap- 



pendix C 

^The analysis framework used in and fAV is discrete rather than continuous. There- 
fore, for the sake of completeness we establish the results for the mean filter and NLM in 
Appendix A and Appendix B, respectively. 
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Table 1: Minimax risk decay rates of several classical image denoising al- 
gorithms; recall that the according to Theorem [T| for a = 2 the optimal 
minimax rate is a^/^. 



Algorithm 


Minimax Rate 


Mean filter [41 




Wavelet thresholding [TTJ |48J 




Nonlocal means |41J 





2.4' Nonlocal means denoising 



In 2005, Buades, Coll, and Morel significantly improved the performance 
of the bilateral filter [57j by incorporating a new notion of pixel similarity. 
As in the mean filter, NLM estimates each pixel value using a weighted 
average of other pixel values in the image. However, NLM sets the weights 
according to the similarity between the pixel neighborhoods rather than the 
pixel values. Furthermore, in contrast to the bilateral filter, in which only 
the vicinity of each pixel contributes to the estimate, in NLM all pixels may 
contribute. 

Specifically, the NLM algorithm is defined as follows. Let 5' = [0, 1]^ 
represent the domain of the image. Define the 5-neighborhood of a (^1,^2) as 



5 5' 
^^-2''^ + 2 



X 



5 5' 



(3) 



Following the definition of NLM in the discrete setting, we pixelate this 



neighborhood by partitioning /^(ti, ^2) into subregions /; 



jlj2 A 



[t 



Ml 



X [t2 



2n ' * 



1~ 2^'^1 



t2 + The pixelated neighborhood of (^1,^2) is defined as 



y^^ t2 ^ K-"^" and satisfies 
We further define the pixelated process as 



Xiti,t2) 



n 



L 



dY{si,S2). 



(si,S2)&Is/n(tl,t2) 
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Define the (^-neighborhood distance between two points in the image as 

4{dY{h,t2),dY{Si,S2)) 

= i\\yU-yLsM - |yLJo,o) -yf^,,,(o,o)n , (4) 

where ||yf^^t2ll2 ~ Si j(yfi,f2(^''^))^' ^^^e that in contrast to the definition 
in |8j, we have removed the center element \yf^^^{0^0) — yi^^s2i^^^)\'^ from 
the summation. As we wiU see in Section [sj ^ oc as a ^ 0, and hence 
the effect of removing the center pixel is negligible on the asymptotic perfor- 
mance. But, as we will see below, removing the center element simplifies the 
calculations considerably. NLM uses the neighborhood distances to estimate 

where w^^^{si^ S2) is set according to the (^-neighborhood distance between 
dy(ti, ^2) and dY{si^ '^2)^ It is straightforward to verify that Ed^((iy(ti, ^2), 
dY{si,S2)) = dj{f{ti,t2),f{si,S2)) + which suggests the following 

strategy for setting the weights: 

. ,^A^1 iidj{dY{t,,t2).dY{s,,S2))<'-^ + r^. ... 
^^'^2^ ^' \ otherwise, ^ ^ 

where is the threshold parameter. Soft/tapered weights have been explored 
and are often used in practice [8]. However, the above untapered weights 
capture the essence of the algorithm while simplifying the analysis. 

The distinguishing feature of NLM — the weighted averaging of pixels 
based on the neighborhoods — produces a decay rate that is superior to 
that of linear filters as shown in [41]. However, compared to the optimal 
rate of a^/^, NLM remains suboptimal. We introduce the anisotropic NLM 
algorithm in the next section to address this gap in performance. 

3. Anisotropic nonlocal means denoising 

In this section, we introduce the Anisotropic Nonlocal Means (ANLM) 
concept that exploits the smoothness of edge contours via anisotropic neigh- 



^We assume that both w^^^^^{si, S2)X{si^ S2) and w^^^^^{si^ S2) are Lebesgue integrable 
with high probability. 
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borhoods. We then present OANLM and DANLM algorithms and explain 
their performance guarantees. 



3.1. Directional neighborhoods 

We now formally introduce the notion of directional neighborhoods. Then 
we extend NLM to exploit such neighborhoods. This will lay the foundation 
for the OANLM and DANLM algorithms. 

Let Rl^i^{-) represent the rotation operator on a neighborhood; when ap- 
plied to a generic point (u^v) G S in the neighborhood around {i^^jj)^ R^^^ 
rotates {u^ v) by 0° counter-clockwise around the point (z/, /x). For a set Q C S' 
we define Qe = R^^j^iQ) as 

{s,t) eQe ^ 3{u,v) G Q such that {s,t) = R^ {{u,v)). 



The {6 ^ 5 5 -anisotropic neighborhood of the point (^1,^2) is defined as 



Si 
2'' 



Si 
2 



Ss Sg 



where 9^ 5^, and Sg {Sg < S^) represent the orientation angle, length, and width 
of the neighborhood, respectively. Figure [3] displays such neighborhoods 
for two different pixels. We partition Iqs (5^(^i,^2) into Ug x subregions 

i^Cs.(ii^t2) ^ RUih - + X [t, - + ft]). 

The pixelated neighborhood of {ti, ^2) is defined as the y G R"«^"< satis- 
fying 



yti,t2 



SgSi 




dY{si,S2). 



Define the neighborhood process as 

X(ti,t2) 



SsSi 




dY{si,S2). 



ns ' Uf 



The anisotropic (5^, 5^, 0)-neighborhood distance between two given points in 
the image is then defined as 



risUi 



^ (\\ytft-ytft\\l - ly'5/^(o,o) -yfSf (o,o)rt . (6) 



h 

Figure 3: The anisotropic neighborhood lo^Ss^Siih^h) in the discrete setting 
for two different pixels of a Horizon class image. 



The ANLM estimate at point (^1,^2) is given by 

I I(s^,s,)es'^tf,tf'{Sl,S2)dSldS2 

where the weights are obtained from 

^^'^^ ' \ otherwise. 

Here r^- is the threshold parameter and ^^f^^^^ + t^j is the threshold value. 

ANLM extends NLM in two significant ways. First, in ANLM O^Sg^Si 
are free parameters, while in NLM 9 = and Sg = S^. As Figures |4] and 
[5] demonstrate, these free parameters significantly affect the performance of 
ANLM. Figure [4] illustrates the visual denoising results as we vary the angle 
of the anisotropic neighborhood. Figure [5] does the same for the empirical 
peak signal-to-noise ratio (PSNR)j^ 



^In practice images are measured on a discrete lattice, i.e., the pixelated values are 
known. Let the pixelated values of the original noise-free image and the estimator be 
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(a) 0°, PSNR = 20.6dB 



(b) 45°, PSNR = 19.2dB 




(a) 90°, PSNR = 20.6dB 



(d) 135°, PSNR = 28.9dB 



Figure 4: Effect of the anisotropic neighborhood orientation angle 6 on the 
denoising performance of ANLM. The images are 256 x 256 pixels with a true 
edge orientation of 135°. The noise standard deviation is = 0.5, which is 



equivalent to PSNR = 6.1dB (see Section 4.1). The neighborhood size and 
threshold parameter are 8s = 1/256, 5^ = 20/256 and r = 2.7^^, respectively. 
The angles of the neighborhoods (9) and the resulting PSNR of the estimates 
are noted below the images. There is a substantial subjective and objective 
improvement in the performance of ANLM when the neighborhood is aligned 
with the edge. 
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Figure 5: Effect of the anisotropic neighborhood orientation angle 6 on the 
denoising performance of ANLM. The experimental parameters are the same 
as in Figure [4} For each value of 6 we plot the result of 10 Monte Carlo 
simulations. PSNR of the original noisy image is 6.1dB 



3.2. Oracle ANLM 

It is clear from Figures |4] and [5] that we should align the ANLM neigh- 
borhood locally with the edge to maximize denoising performance. In the 
Oracle ANLM (OANLM) algorithm, we assume that we have access to an 
oracle that knows the image's edge orientation at each pixel. Consider 
fhih^h) ^ H^{C). Let h'{ti) denote the derivative of the edge contour 
and let F^ = {(^1,^2) = 7} segment the region S. Define 9^ as the 
angle between the tangent to the edge contour h and the horizontal line at 
ti = 7. OANLM is defined as ANLM with the following parameter settings: 

• Quadratic scaling: Instead of using isotropic (square) neighborhoods, 
we use anisotropic (rectangular) neighborhoods of size 5^ x 5^. The 
scaling of 6s and 6^ depends on the smoothness of the edge. Since 



fij and fij^ respectively. The empirical mean-squared-error (MSE) of an estimator / is 
defined as MSE = ^ "^Zi jifij ~ fij)'^- 1^ fiJ ^ [0?^]? then the empirical peak signal-to- 
noise ratio (PSNR) is defined as lOlog^^g i^e* 
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Figure 6: The OANLM neighborhoods satisfy the quadratic scahng 5s = 
and are ahgned with the edge. In regions far from the edge, isotropic 
neighborhoods perform just as well. 



we are primarily interested in i7^(C), we will emphasize the quadratic 
scaling (5, = ^a^'^\ \og(j\''^ and 5^ = 2a^/^\ logap/^; that is = Si 



• Aligned neighborhoods: We align the neighborhood at point (^1,^2) to 
the orientation 9t^ of the edge contour at Thus all pixels 
in a column have the same orientation. Figure [6] illustrates such a 
neighborhood selection]^ 

• Logarithmic pixelation: We assume that rig x Ui = [log^ a] . In other 
words, as a increases the resolution of the pixelated image increases as 
well. 

Theorem 2. If is the OANLM estimator with the threshold = / ^ , 
then 

R{H^{C)J^) = 0(a^/^|loga|^/^). 



^If the smoothness of an edge is a > 1, then the optimal neighborhood size is given 

by 6s = 6(cr2"/("+^)| logCrp«/("+l)) ^ @(^2/(a+l)| iQg^|2/(a+l))^ ^^^^ ^^^^ 

special case a = 2 they satisfy the quadratic scaling. 

^In smooth regions, where the pixel neighborhoods do not intersect with the edge, 
neither the shape nor the orientation of the neighborhood affect the denoising performance. 
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The proof of Theorem [2] can be simply modified to provide a bound for the 
risk of NLM as weU. Let 5 = and n = s/n^ = \ log(cr)|, where Sg^ S^^ 

ris^ n£ are as given in OANLM algorithm. 

Corollary 1. If is the NLM estimator with threshold = 2/^| log(a)|^ 
then 

R{H-{C)J'') = 0{a\\oga\). 



See Section 7^ for the proof of this corollary. Also, under certain mild 
assumptions, the risk of NLM can be lower bounded by ^{a). The proof is 
similar to the proof of Theorem 5 in [41j. However, for the sake of complete- 
ness and since our framework differs from [H], we overview the main steps 
of the proof in [Appendix B 



Theorem [2] is based on the strong oracle assumption that the edge direc- 
tion is known exactly. Needless to say, such information is rarely available in 
practical applications. Consider a weaker notion of OANLM that has access 



to an estimate 6^ of 6^ that satisfies 



1^7-^7! < eK). (8) 

OANLM with exact edge orientation information corresponds to /5 = oc 
in this model. When /3 < oc, the choices 5^ = 2cr^/^| log ap/^ and 5s = 
4(7^/^1 logcr|^/^ are not necessarily optimal. The following theorem character- 
izes the performance of OANLM using 9. 



Theorem 3. Let the OANLM estimator use the inaccurate edge ori- 
entation 9 J satisfying ([s]). Set the neighborhood sizes to 5^ = 
min(a^/^| log^/^ a|, loga|) and Ss = a'^log^ a/S^. Then the risk of the 

estimator with = 2/^| log(cr)| satisfies 

i?(i/"(C),/0) = OaPoly(|loga|)), 
where Poly(| logcr|) is a polynomial of degree at most 2 in terms of \ loga|. 

If the edge estimate is exact, then the result simplifies to the result of 
Theorem [2} However, this theorem confirms that OANLM algorithm achieves 
the optimal rate even if there is an error in 9 of order 0{a^) with /3 > 2/3. 
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Corollary 2. Let the OANLM estimator use the inaccurate edge orienta- 
tion 9^ satisfying ^ with (5 > 2/3. Set the neighborhood sizes to Si = 
2cr^/^| log^^^ al and Sg = 4:a^^^\loga\. Then, the risk of the estimator with 
Tcr = log(cr)| satisfies 

R{H-{C),P) = 0(a^/^Poly(|logcT|)), 
where Poly(| loga|) is a polynomial of degree at most 4 in terms of \ log(a)|. 

Proof If we plug in ;5 > 2/3 in Theorem |3] we obtain 

Se = mmi2a^/'\log'/'a\,2a'-^/^\loga\) = 2a^/''\logaf/' . 
Therefore, Sg = 4:a^/^\ loga\^^^. This estabhshes the result. □ 

We can also simplify the result of Theorem [3] for the case of /5 < 2/3. 

Corollary 3. Let the OANLM estimator use the inaccurate edge orienta- 
tion 9^ satisfying ^ with (5 < 2/3. Set the neighborhood sizes to Si = 
2cri-^/2| log (J I and Sg = 4cr-^+^/^| logcr|. Then, the risk of the estimator with 
Tcr = 2/^1 log((j)| satisfies 

R{H"{C)J'') = 0(ai+^/2poly(|loga|)). 
where Poly(| loga"|) is a polynomial of degree at most 3 in terms of \ \og{a)\. 

Proof. Note that this corollary is exactly the same as Theorem |3} In fact, 
since /9 < 2/3, 

5e = min(a2/3| bg^/^ a|, a^-^/^i ^^^^^^ ^ ^i-/?/2| ^^g^j^ 

Therefore, Ss = \ logcr| and therefore, according to Theorem [3] 

R{H-{C)J'') = 0(a,Poly(|loga|)) = 0(a^+^/2poly(| log 

□ 

This corollary suggests that, as long as we have an edge orientation esti- 
mate that improves as a ^ (or the number of pixels increases), OANLM 
outperforms NLM. Also note that, as /3 decreases, the neighborhoods become 
more isotropic. 
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(u,v) 



Figure 7: DANLM neighborhoods around one pixel location for g = 4. 



3.3. Discrete angle ANLM 

In this section, we introduce the Discrete-angle ANLM (DANLM) al- 
gorithm that achieves the minimax rate without oracle information. The 
key idea is to calculate the neighborhood distance over several directional 
neighborhoods and fuse them to obtain a similarity measure that works 
well for all directions. As for OANLM, set 5s = 4cr^/^| log(cr)|^/^ and 5i = 
2a2/3|log(a)|2/3, and let q = Tra'^/^ Define the angles = 0, = a^/^ 
02 = 2a2/3,...,^,4 7r-a2/3 and 

For a point {u^ v) we consider all of the anisotropic, directional neighborhoods 
for 9 G O. Figure 7 displays four of these neighborhoods. 

Define the discrete angle anisotropic distance between dY{ti^t2) and 
dY{si^ S2) as 

dl^{dY{ti,t2),dY{si, S2)) = min ^2), dY{si, 53)), 

where dlg^ g^{dY{ti,t2), dY{si, S2)) is defined in Q. The DANLM estimate 
is then given by 
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where 



^^'^^ ' 1 otherwise. 



In summary, we note the foUowing features of the DANLM algorithm. 
First, it uses the quadratic scahng Sg = Sf. Second, the optimal neighborhood 
direction can change from pixel to pixel. The following theorem, proved in 
Section [7| shows that the risk of this algorithm is within the logarithmic 
factor of the optimal minimax risk. 

Theorem 4. The risk of DANLM satisfies 

R{m{C^)J'') <0{a"^\\og{a)f^). 

Figure [8] reprises the simulation experiment of Figure [4] using the four 
algorithms described thus far: NLM, OANLM with perfect knowledge of the 
edge orientation, OANLM with imperfect knowledge of the edge orientation, 
and DANLM with four angles. All three of the latter algorithms outperform 
the isotropic NLM. 



4. Practical ANLM algorithms and experiments 

In this section we introduce a practical gradient-based ANLM algorithm 
and then complement the above theoretical arguments with additional simu- 
lations and experiments with synthetic and real- world imagery. Note that the 
algorithms we introduce in this section do not immediately outperform state- 
of-the-art denoising algorithms such as BM3D [13] but are rather intended 
to highlight that anisotropic neighborhoods are more suitable for edges than 
isotropic ones. Since NLM is a key building block in several top-performing 
algorithms [13] [29], we expect that anisotropy will pay off for such algorithms 
as it does for NLM. But addressing this important question is left for future 
research. 

Jf..!. Extension to discrete images 

In practice the observations are noisy pixelated values of an image and 
the objective is only to estimate the pixelated values. In this section we 
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(a) NLM, PSNR = 26 dB (b) OANLM with no error, 

PSNR = 28.9 dB 




(c) ONALM with error, (d) DANLM, PSNR = 27.5 dB 

PSNR = 28.6 dB 



Figure 8: Continuation of the experiment of Figure [4| that compares 
(a) isotropic NLM, (b) OANLM with perfect knowledge of the edge direc- 
tion, (c) OANLM with 10% error in the knowledge of the edge direction, 
(d) DANLM with g = 4 angles. The images are of size 256 x 256 pixels. In 
the ANLM algorithms, Ss and are the same as in Figure [4} The neighbor- 
hood size of NLM is ^/6^~^^~S£. The edge orientation is 135°. The standard 
deviation of the noise is ^ = 0.5 which implies PSNR = 6.1dB. 

explain how the ideas of directional neighborhood and ANLM can be ex- 
tended to the discrete settings. Suppose we are interested in estimating a 
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Figure 9: The anisotropic neighborhood Ie^Ss,Si{^i ^) in the discrete setting 
for two different pixels of a Horizon class image. The bold red dots represent 
the anisotropic neighborhoods. 



nxn image /(-, ^) with noisy observations Oij = /(-, ^) + where Qj 
A^(0,(^^) and ^ is the standard deviation of the noise. The extension of the 
anisotropic neighborhood to the discrete setting is straightforward. Let S = 
{hh---^^A}x{hl,---,^^}^^dS^{l,2,...,n}x{l,2,...,n}. 
For a given set B C S = [0, 1]^ we define B = BnS. 

The discrete (6*, 5^, (5^)- distance between two pixels Ojj and Om/ is defined as 



— O, 



'm-\-r,£-\-q ) i 



(10) 



where V ^ {(r,g) G | i±2) e le^Aih i)}- See Figure§ Note that 
for the pixels that are close to the image boundaries the neighborhoods are 
not rectangular any more and they include the part of the rectangle that 
is inside the image boundaries. Such non-rectangular neighborhoods will 
be compared with non-rectangular neighborhoods of the other pixels for the 
calculation of the distance. The ANLM estimate at pixel /(-,-) is given by 
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where the weights are obtained from the distances ^^{oij^ Om/)- For in- 
stance, the simple pohcy introduced in ^ corresponds to 



^^]3 \jn,i) = 



\ otherwise. 



Using the discrete anisotropic neighborhood and distance, the extensions of 
OANLM and DANLM to the discrete setting is straightforward. 

1^.2. Gradient-based ANLM 

While DANLM is somewhat practical and also theoretically optimal for 
the Horizon class of images, its computational complexity is higher than 
NLM and grows linearly in the number of directions g, where q ^ o{ri^^^). As 
a more practical alternative, we propose Gradient based ANLM (GANLM), 
which adjusts the orientation of an anisotropic ANLM neighborhood using an 
estimate of the edge orientation provided by the image gradients. Pseudocode 
is given in Algorithm [T] Note that GANLM reverts back to NLM in regions 
with low image gradients, since they will not be "edgy" enough to benefit 
from the special treatment. 

There is a rich literature on robust image gradient estimation [6l [22l |39] . 
In our implementations we use the simplest method for estimating the image 
gradient, i.e., 

9h{iJ) = fi+ij - fij^ 

where /(i, j) is an estimate of the image (for instance from isotropic NLM). If 
9h{hj) and gv{hj) the estimated image derivatives at pixel (i, j), then we 
can estimate the local orientation of an edge by 0{i^j) = arctan^|^|^^ . To 

allay any concerns that gradient-based adaptivity is not robust to noise and 
errors, we recall Theorem [3j which establishes the robustness of OANLM to 
edge angle estimation error. For extremely noisy images, numerous heuristics 
are possible, including estimating the image gradients for GANLM from an 



isotropic NLM pilot estimate. Figure 11 builds on Figured with a slightly 



more realistic, curved edge and demonstrates that the pilot estimate approach 
to GANLM performs almost as well as an oracle GANLM that has access to 
the edge gradient. 
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(a) Noisy image, (b) Horizon image (c) NLM, PSNR = 34.4dB 

PSNR = 16.5dB 




(d) DANLM, (e) Oracle GANLM, (f) GANLM, PSNR = 38dB 

PSNR = 36.2dB PSNR = 38dB 



Figure 10: Simulation experiment in the same vein as Figures [4| and [8| but 
with a slightly more realistic curved edge. The images are of size 512 x 
512 pixels. In the ANLM algorithms Sg and 5^ follow the quadratic scaling 
of Theorem [2| The neighborhood size of NLM is \f57><5l. The standard 
deviation of the noise is ^ = 0.15. To show the differences we have zoomed 
on the area represented by the red block in Figure (a). 
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Algorithm 1 Gradient-based ANLM (GANLM) 



Inputs: 

fi^j : Estimate of the image pixel 
8s X 8f Size of the neighborhood 

A: Threshold that determines isotropic/anisotropic neighborhood selection 

Estimate image gradient {gh{h j) ^ Qvih j)) at each pixel {i^j) 
for every pixel {i^j) G / do 

9i j = arctan ( ^^^/^\ ] 

if ^ A then 

Perform ANLM at pixel yij with ds^^SsAj 
else 

Perform NLM at pixel i/ij with ^/S^. 
end if 
end for 



Table [2] summarizes the performance of the algorithms introduced in this 
paper with that of NLM on the natural test images Barbara [1], Boats j2], 
and Wet Paint (3] submerged in noise of two different levels. The table 
enables us to draw two conclusions. First, the performance of the practical 
GANLM algorithm is very close to the oracle GANLM algorithm. Second 
these two algorithms outperform DANLM in all but one case and significantly 



outperform standard NLM in all cases. We use 4 discrete angles^ in DANLM 
and attribute the superior performance of GANLM over DANLM to a more 
accurate selection of orientations. 

4^3. Computational complexity of ANLM 

Improving the risk performance of linear filters by nonlocal averaging is 
achieved at the price of higher computational cost. If |7^| denotes the neigh- 
borhood size, then the computational complexity of NLM is proportional 
to B(n'^|7^|) as compared to B(n^) for linear filters. Such high complexity 
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Experiments with larger q did not differ appreciably in PSNR at this image size. 



22 



Table 2: PSNR Performance of NLM and ANLM algorithms on three natural 
test images at two noise levels. 



Test image 


Algorithm 


^ = 0.25, 12.1dB 


^ = 0.15, 16.5dB 




NLM 


22.48 


25.86 


Barbara 


DANLM 


23.15 


26.27 


(512 X 512) 


Oracle GANLM 


23.51 


26.63 




GANLM 


23.50 


26.60 




NLM 


22.75 


25.83 


Boats 


DANLM 


23.45 


26.45 


(512 X 512) 


Oracle GANLM 


23.88 


26.45 




GANLM 


23.75 


26.35 




NLM 


27.66 


30.49 


Wet Paint 


DANLM 


28.41 


30.75 


(1024 X 1024) 


Oracle GANLM 


29.02 


31.18 




GANLM 


28.86 


31.07 



is not acceptable in many applications. Therefore, several effective meth- 
ods have been proposed to reduce the computational complexity of NLM. 
See [71 El HHl [45j and the references therein for more information on the re- 
search in this direction. These approaches have reduced the computational 
complexity of NLM to an acceptable limit for practice. 

The computational complexity of ANLM is in general higher than that 
of NLM. For instance, the computational complexity of the Gradient-based 
ANLM algorithm introduced in Section |4.2| is roughly two times that of 
NLM if we ignore the gradient calculation, which is 0(n^). The computa- 
tional complexity of DANLM is G(gn^|7^|), where q denotes the number of 
angles considered. Most of the approaches proposed for speeding up NLM 
are applicable to ANLM as well. Furthermore, due to specific structure of 
DANLM, in particular the overlap of the neighborhoods of each pixel, we 
expect to be able to reduce its computational complexity even further. But 
this question is left for future research. 

5. Related work on anisotropic denoising 

Anisotropy is a fundamental feature of real- world images. As a result, 
anisotropic image processing tools can be traced back at least as far as 
the 1970s [44j. Here, we briefiy compare and contrast some of the relevant 
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(c) Wet Paint 



Figure 11: Zooms of the natural test images denoised by NLM (left) and 
ANLM (right) studies in Table [2| The standard deviation of the noise is 
^ = 0.15 that implies PSNR= 16.5dB. 
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anisotrpic denoising schemes with ANLM. 

Anisotropic filtering methods use a space- varying hnear convolution filters 
to reduce blur along image edges. Toward this goal, 01] considers several 
different neighborhoods around a pixel and selects the most "homogeneous" 
neighborhood to smooth over. A more advanced version of this idea can be 
found in pQl [55l [56]. There are major differences between such algorithms 
and NLM/ANLM; in particular, the estimators are local and do not exploit 
global similarities. See also [43j for a more complete overview of such algo- 
rithms and their differences with NLM/ANLM. 

Anisotropic diffusion methods smooth a noisy image with a Gaussian 
kernel. As the standard deviation of the kernel increases, the smoothing 
process introduces larger bias to the edges. In ^26^ the authors proved 
that the set of images derived by this approach can be viewed as the solution 
to the heat diffusion equation. Perona and Malik [47] noted the isotropy 
in the heat equation and introduced anisotropy. Their anisotropic diffusion 
starts with the heat equation but at every iteration exploits the gradient 
information of the previous estimate to increase the conductance along the 
edges and decrease it across the edges. See |30l [311 |58] for more recent 
advances in this direction. Efforts to theoretically analyze the risk of such 
algorithms have left many open questions remaining [4j. It is conjectured 
that there is a connection between such algorithms and iteratively applying 
nonlinear filters such as the median filter. It is worth noting that the idea 
of applying an image denoising algorithm iteratively and guiding it at every 
iteration based on previous estimates goes back to Tukey's twicing [59j. 

Anisotropic transformations enable simple threshold based denoising al- 
gorithms. While the standard separable wavelet transform cannot exploit 
the smoothness of the edge contour, a menagerie of anisotropic transforms 
have been developed, including ridgelets pTj, curvelets |9], wedgelets \17\ . 
platelets ^£4J, shearlets ^|25l[35], contourlets [IS], bandelets j46ll48]. 
and directionlets [^. As mentioned in the Introduction, among the above al- 
gorithms only wedgelets can obtain the optimal minimax risk for the Horizon 
class; however wedgelets are not suited to denoising textures. One promis- 
ing avenue combing wavelets (for texture denoising) and wedgelets (for edge 
denoising) could follow the path of the image coder in [62j . See [271 126j for 
an overview of anisotropic transformations and their applications in image 
processing. 



25 



Alternative NLM algorithms have been proposed to address the ineffi- 
ciency of using a fixed neighborhood EHESlSniESlEH]. In[l5l[52], 
the authors aggregate several nonlocal estimates to obtain a final estimate. 
The nonlocal estimates are derived from NLM with different neighborhoods 
around each pixel. While these methods improve upon NLM on edges in prac- 
tice, there is no theoretical result to support such empirical results. Similar 
approaches have been exploited in several state-of-the-art denoising algo- 
rithms il4j. In [29l [60l [65], the authors adapt the neighborhood size 
to the local image content. [29] considers different isotropic NLM neigh- 
borhood sizes depending on how smoothly the image content varies. [65j 
uses image gradient information to increase the weights of the NLM along 
edges and decrease them across edges. This method is equivalent to mod- 
ifying the threshold parameter to force NLM to assign higher weights 
to edge-like neighborhoods. [60j sets the neighborhood size based on Stein's 
unbiased estimate of the risk (SURE). Other generalizations include the rota- 
tionally invariant similarity measure [24] and nonlocal variational approaches 
[231 [321 [49j . Unfortunately, these techniques do not reduce the bias that ren- 
ders NLM sub-optimal. 

Finally, data-driven optimality criteria have been considered in [121 EHl 
[50] , where the authors derive lower bounds for the performance of denoising 
algorithms. However, the analyses provided in these papers are not fully 
rigorous and do not cover the performance of NLM for images with sharp 
edges. 

6. Discussion and future directions 

We have introduced and analyzed a framework for anisotropic nonlocal 
means (ANLM) denoising. Similar to NLM, ANLM exploits nonlocal infor- 
mation in estimating the pixel values. However, unlike NLM, ANLM uses 
anisotropic, oriented neighborhoods that can exploit the smoothness of edge 
contours. This enables ANLM to outperform NLM both theoretically and 
empirically. In fact, the performance of ANLM is withtin a logarithmic factor 
of optimal as measured by the minimax rate on the Horizon class of images. 

Numerous questions remain open for future research. From the theoretical 
perspective, the risk analysis of GANLM, the application to noise models 
beyond Gaussian, and the extension to three dimensions and beyond (for 
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seismic, medical, and other data) pose interesting research chaUenges. From 
a practical perspective, the question of how to best tune ANLM to match 
the nuanced edges and textures of real-world images remains open, since we 
have considered only brutal binary images here. Finally, while NLM is no 
longer the state-of-the-art denoising algorithm, it is a key building block in 
several top-performing algorithms. It would be interesting to see whether 
anisotropy pays off as handily for those algorithms as it does for NLM. 



7. Proofs of the main results 



7.1. Preamble 



We first introduce some notation. Define the following partitions of the 
set S = [0, 1]^: 

4 {{v^u)\u>h{v) + {l + C/2)Ss}, 
4 {(y^u) \h{v)-{l + C/2)6s<u<h{v) + {l + C/2)6s}, 
^3 4 {{v,u) \ u<h{v)-{l + C/2)5s}. 

It is important to note that, if (ti,t2) ^ Si and tan(0) = h'{ti)^ then 
h^Ss^Seiti^h) does not overlap with the edge contour. In other words, the 
correctly aligned neighborhood of (^1,^2) ^ Si is always above the edge. The 



points in Ss satisfy a similar property. This is clarified in Figure 12 



We further partition Si into Pi and P2 and Ss into P3 and P4 such that 

Pi 4 {{v^u) \ h{v) + 25i + {C/2)5s<u}, 

P2 = {{v, u) I h{v) + (1 + C/2)5s <u< h{v) + 25^ + (C/2)5J, 

P3 = {[v, u) I h{v) - (1 + C/2)5s >u> h{v) - 25, - (C/2)(^J, 

P4 4 {[v, u)\u< h{v) - 25, - {C/2)5s}. 



Lemma 1. Any neighborhood of pixel (v^u) G Pi will lie completely above 
the edge contour. 

Proof. Let (ti,t2) e Pi. If {u,v) e h^dsAi^iM), then 

t2-v< 5^ (12) 
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Figure 12: Regions S'l, S2 and S-^. The neighborhood of {u^ v) G 6^3 is ahgned 
with the edge contour, and therefore it does not intersect the edge, while the 
neighborhood of {u' ^v') is not ahgned and therefore may intersect with the 
edge. The neighborhoods of the pixels in 5^2 may intersect the edge even 
though they are correctly aligned. 



Pi 




Figure 13: The regions Pi, P2, ^3, ^4 with = (1 + C /2)5s and = 
25(^ — 5s. Every neighborhood of (^1,^2) ^ Pi will lie completely above the 
edge contour. However, some of the neighborhoods of the pixels (^1,^2) ^ P2 
may intersect the edge. A similar property holds for the regions P3 and P4. 
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On the other hand, for t[ G [ti — 5^, ti + S^] we have, 



h{t[) = h{h) + h'{h){t[ - h) + lh"{t';){t[ - 



where t'[ is between ti and t'^. According to the definition of the Horizon 
class, h G Holder^ {C) and hence \h\ti)\ < 1. Therefore, 



h{t[)-h{h)<Se + ^C5, 



Comparing (12) and (13) completes the proof. 



(13) 

□ 



In spite of Pi, some of the neighborhoods of the pixels (v^u) G P2 may 
intersect the edge. Similarly one can prove that any neighborhood of a pixel 
{v^u) G P4 lies completely below the edge, and some of the neighborhoods of 



the pixel {v^ u) G P3 may intersect the edge. Figure 13 displays these regions. 



The following lemma, proved in [41^, will be used in the proofs of the main 
theorems. For the sake of completeness, we sketch the proof here. 

Lemma 2. Let Zi, Z2, . . . , be iid N {0^1) random variables. The Xr 
dom variable Z'f concentrates around its mean with high probability, 

i.e., 



(14) 
(15) 



Proof. Here we prove (14); the proof of (15) follows along very similar lines 



From Markov's Inequality, we have 
P 



Jl5:zfj-i>tj<e-*-E(e?i:r^.-f) 



= 8-"*-" E 6 



(16) 
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The last inequality follows from Lemma 3 in [¥T]. The upper bound proved 
above holds for any r] < ^. To obtain the lowest upper bound we mini- 
mize ^ over 77. The optimal value is 77"*" = argmin^ ^ "^r = ttttttt' 

Plugging r]~^ into (16) proves the lemma. □ 



7.2. Proof of Theorem^ 

In the set up above, we consider three different regions for the point 
(ti,t2)- As we will see in the proof, the risk of all pixels in each region has 
the same upper bound. We calculate these upper bounds and then combine 
them to obtain a master upper bound for the risk of OANLM. 



Case I: (^1,^2) ^ Si. We know that if the anisotropic neighborhood of (^1,^2), 
h,ds,6i is aligned with the edge contour, i.e., tan(0) = h'{ti)^ then it does 
not intersect the edge contour. To calculate the OANLM estimate we first 
calculate the weights. Define 

<f,;f'UuJ2)^f^ f . . dW{s,,S2), (17) 

and 

' ns ' n£ 

F{ti,t2) = / / f{si,S2)dSidS2. 

' ns ' 

For notational simplicity we will use w{si^S2) and 21^^12(31^2) instead of 
^^tftf' {^i^S2) and zj^|/^(ji, ^2), respectively. Define 

Ai = {(7^,7;) G Pi I w{u,v) = 1}, 
A2 = {(7^,7;) G Pa I 77;(77,7;) = 0}. 

Here, let A(-) denote the Lebesgue measure of a set over R^. 
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Lemma 3. Let Se = 2a^/^\log{a)\'^/^ and = 4(7^/3 1 log(a)|4/^ 
2| log(cT)P/3 ne = 41 log{a)\'^/^, and = , ^ . Then, 



P(A(Pi) - A(Ai) > e) = O 
P(A(P4) - A(A2) > e) = O 



Proof. Here we prove the first expression. The proof of the second expression 
foUows along a similar route. Let {u,v) G Pi. Since 'Zti,t2{h^ h) is the 

2 

integral of the Wiener sheet, we have i^nMih) h) ^ ^(0, ^^^^^^ ). Similarly, 
'^u,v{j 1^2) ^ A^(0, ^^^xlf~)' Combining this with Lemma [i] we conclude that 



P ( dl,^^,^{dY{t,,h),dY{u,v)) - > ) < e-^ = O(a^). 



2 \ __.2 



Therefore, 

E(A(^i)) = e[ I{{u,v)eA,)= [ F{{u,v)eA,) 

> [ F(dls^^sMntl,h),dY{u,v))-2'^ff^<r^ 

J{u,v)ePi \ ^sO£ J 

= A(Pi)-0(a«). (18) 

An upper bound for P(A(Pi) — X{Ai) > e) using the Markov inequality yields 
the result 

□ 



This lemma indicates that most of the points in Pi may contribute in the 
estimation of (^1,^2) ^ Si and most of the points in P4 do not contribute in 
the estimate. Define the event S as 

£ ^ {A(Pi) - X{A,) < a'} n {A(P4) - X{A,) < a'}. 
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Note that according to Lemma [sj F{S^) = 0{a^). As we will see later 
this lemma simplifies the calculation of the risk dramatically. For notational 
simplicity we also define the following notations: 

wXdudv = w{u^v)X{u^v)dudv^ 

Xdudv = X{u^v)dudv^ 

wdudv = w{u^v)dudv^ 

Zdudv = Z{u^v)dudv^ 

wZdudv = w{u^v)Z{u^v)dudv. 



The risk of the OANLM estimator at (^1,^2) ^ Si is then given by 



^ ' wXdudv\ 
wdudv J 



Define U 



wXdudv \ ' 
wdudv J 



We have 



E{U) = E{U I S)F{S) + E{U I f ")P(^") < E{U \ S)F{S) + P(f (19) 



Lemma [3] proves that F{£^) = 0{a^). Also, since for (u^v) G w{u^v) = 1, 
we have 



/ wXdudv — I 
Jpi Jpi 



Xdudv 



wXdudv — / wXdudv + / Xdudv — / Xdudv 



/ wXdudv — / wXdudv + / Xdudv — / Xdudv 
Jpi Jai Jai J Pi 



wXdudv 
0{X{Pi\A^)) 



(20) 



For the last inequality we have assumed that the estimate is bounded over 
the Pi region. This assumption can be also justified by calculating the prob- 
ability that this condition does not hold and showing that the probability 
is negligible. Using arguments similar to (20) it is straightforward to prove 
that 
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wXdudv 



P4 



0(A(P4\^)). 



/ wdudv — / dudv 
Jpi Jpi 



= 0{x{PMi))- 



wdudv 



Pi 



Oi\{PM2))- 



(21) 



Define Pu = A U P4 and B = 5'\Pi4. We now calculate an upper bound 



for the first term of (33) 
E([/ I S)F{S) 



(a) 



E 



E 



Jp^^ wXdudv + Jp wXdudv ^ 
Jp^^ wdudv + Jp wdudv 



Jp^ Xdudv + Jp wXdudv\ 
Jp^ dudv + Jp wdudv J 




< E 



(Jp^ Xdudv + Jp wXdudv \ ^ ^ 
Jp^ dudv + Jp wdudv I 



(b) 

< E 



+ 2 



Jp wFdudv 



Jp^ dudv + Jp wdudv 



E 



Jp^ Zdudv + Jp wZdudv \ ' 
Jp^ dudv + Jp wdudv I 



E 



Jp wFdudv 



\ \ Ipi dudv + Jp wdudv J \ \ Ipi dudv + Jp wdudv J 
+ 0{a'). (22) 



E 



Jp^ Zdudv + Jp wZdudv\ 



Equality (a) is an application of (20) and (21). To obtain (b) we plug in the 
value of X = F + Z^ where F and Z represent signal and noise respectively. 
In the next two lemmas, Lemma [22] and Lemma [5| we obtain upper bounds 
for individual terms in (22). 



Lemma 4. Let w{u^v) he the weights of OANLM with Si^Ss^ris^ni, and r^j 
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as specified in Lemma \^ Then 

wFdudv 



E 



Jp^ dudv + Jp wdudv 



<0(a^/^|log(a)|^/^). 



Proof. We have 



E 



Jp wFdudv 



Jp dudv + Jp wdudv 



(a) 

< E 



Jp dudv 
Jp dudv 



(m)' = 0(a^/^|log(a)n. 



To obtain Inequality (a) we maximize the numerator and minimize the 
denominator independently. The last equality is due to the fact that 
X{B) = 0{S,). □ 

Lemma 5. Let w{u^v) be the weights of OANLM with 5£^5s^ns^n£^ and 
as specified in Lemma Then 



E 



Jp Zdudv + Jp wZdudv \ 



Jp^dudv + Jp wdudv j ^ I s( )l ) 



Proof Since Jp wdudv > 0, and we are interested in the upper bound of the 
risk, we can remove it from the denominator to obtain 



E 



Jp^ Zdudv + Jp wZdudv\ 
Jp^ dudv + Jp wdudv j 



< E 



Jp Zdudv + Jp wZdudv\ 



Jp dudv 



, Jp^ Zdudv 
Jp dudv 



■E 



Jp wZdudv\ 
Jp^ dudv j 



Vi 



V2 



+ 2. E 



Jp^ Zdudv \ 



\ \ Ip, dudv J 



i 



( 



E 



Jg w Zdudv 



\ 



dudv 
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Now we obtain upper bounds for ^1,^2, and V3 separately. First, we have 



E 



Jp^ Zdudv 
Jp dudv 



( 



E 



Jp^ Jp^ Z{u, v)Z(u', v')du'dv'dudv 



\ 



J 



< 



25^ 



Fj{Z{u^ v)Z{u\ v'))du'dv'dudv\ 



p ,r , ,^^ff^du'dv'dudv\ 



\ 



^ Jp^ dudv 



O (a^) . 



(23) 



Equality (b) is due to the fact that if {u'^v') ^ I^2Ss 2s^{u^v) then Z{u^v) 

^ ns ^ 

and Z{u'^v') will be independent and hence Fj{Z{u^v)Z{u' ^v')) = 0. In- 
equality (c) is due to the Cauchy-Schwartz inequality \Fj{Z {u^ v) Z {u^ ^ v^))\ < 



To obtain an upper bound for V2, we first note that 
^[w[u, v)Z{u, v)w{u\ v')Z{u', v')) 



< v)Z{u, v)y^I]{w{u', v')Z{u', v')f 



UsTiia 



and hence, 

^2 = E 

= E 
/ 



Jp wZdudv\ 
Jp^ dudv I 

Jp Jp w{u^ v)Z{u^ v)w{u' ^ v')Z{u\ v')dudvdu'dv' 



< 



LlB'^dudvdu'dv''^ 



{Jp dudvy 



V 



dudv 



I 



(24) 



(25) 



Note that the last inequality is due to the fact that \{B) = 0(5^). Using (23) 
and (25) it is straightforward to obtain an upper bound for V3 by using the 
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Cauchy-Schwartz inequality. Combining the upper bounds for Vi^ V2, and V3 
completes the proof. □ 



Combining (22) with Lemmas 4 and 5 establishes the following upper 



bound for Case I, where (^1,^2) ^ Si: 

E / - -^^ ; / \ : ^ = O log a . 26 

\ J^w{u,v)dudv ) 



Case II: (^1,^2) ^ 5^2. In this case we assume that the risk is bounded by 1, 
since the function / is bounded between and 1. If the estimate is out of 
this range, then we will map it to either or 1. 

Case III: (ti,t2) ^ 5^3. This case is exactly the same as Case I , and hence 
we skip the proof. 

Finally, combining our results for Cases I, II, and III, we can calculate an 
upper bound for the risk of OANLM as 

/^) = E(ii/ - /^in = / (/ - j'^fdt.dt^ + [ if- lydt.dh 

JS1US3 JS2 
= A(5iU53)0((T4/3|log(a)r/3) + A(52)0(l) = 0(a4/3|log(cT)r/^). 

This ends the proof of Theorem [2] 

7.3. Proof of Corollary \l\ 

The proof of this corollary follows exactly the same route as that of The- 
orem [2| We merely redefine the regions Sk and Pk for k G {1, 2, 3}. The new 
definition of the Sk regions for the NLM algorithm is given by 

= {(^1,^2) \t2>h{t,)+2S}, 

= {(tl,t2) \t2<hit,)-2S}, 

and = SXS"^ U S^. Since the neighborhoods in NLM are isotropic, we 
require a single parameter to describe the neighborhood size, 6 = 6s = S^. 
For notational simplicity we have assumed that < S. This is a valid 
assumption for small enough values of a, since 5 ^ as a ^ 0. This 
assumption implies that the neighborhood of the pixels in Si and S^ does 
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not intersect with the edge contour. Since the neighborhoods no longer have 
different anisotropic lengths, the size of the intersection of the neighborhoods 
and the edge contour is identical in all directions and there is no need to 
define the regions, Pi, . . . , P4. Equivalently, we set Pi = P4 = and 
P2 = P3 = With these definitions the rest of the proof follows along the 
exact same lines as the proof of Theorem [2] In particular one can show that 
the risk of a pixel (^1,^2) ^ Si U is 0(a^ log^(cr)), and the risk of a point 
(^1,^2) ^ ^2 is 0(1). Furthermore, X{S2) = 0{25) = 0(alog(a)|). Therefore 
the entire risk of NLM is equal to 

R{f, = [ if- f'^fdhdh + [ if- lydhdh 

JS1US3 JS2 
= X{Si U Ss)0{a' log^(a)) + XiS^Ml) = 0{a\ log(a)|). 

7.4. Proof of Theorem\^ 

The proof of this theorem is very similar to the proof of Theorem [2] The 
only diflFerence is in the definitions of Sk and Pk for k G {1, 2, 3}. Since there 
is a mismatch between the orientations of the neighborhood and the edge 
contour, the neighborhood of a point in Si may intersect the edge contour. 
In order to fix this, we define the new regions called and Pf . If the error 
in 9 is upper bounded by c^cr^, then define 

Sf = {{tut2)\t2>hiti) + c^a^Se + 5, + {C/2)Sj}, 
si = {{h, h) I h < hih) - cpa^e -Ss- iC/2)Sj}, 

and S2 — S\Si U S^. Let (^1,^2) ^ Si and set 9 = arctan(/i'(ti)). Consider 
a directional neighborhood of (^1,^2) with direction 0, where \9 — 9\ < Cf^cr^ . 
Then it is straightforward to confirm that 

^^>„(5^(*l'*2) C Si. 

Furthermore, define Pf = Pi, Pf = P4, P^ = \Pi, and P^ = S^\P^. Note 
that the risk of a point (^1,^2) ^ Si can be exactly calculated as in the proof 
of Theorem [2] and therefore, it is upper bounded by 0((7^/^| log(cr)|^/^). Also, 
the risk of the pixels in S2 upper bounded by 0(1). Finally the Lebesgue 
measure of S2 is 0{ci3G^5i + 5s + {C/2)5j). Combining these facts establishes 
the claimed upper bound. 
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7. 5. Proof of Theorem 

Since the proof is mostly similar to that of Theorem [2| we shall focus on 
the major differences. The first difference is that we consider the regions Pi^ 
P4 instead of SiSs. Previously the regions Pi and P2 were treated jointly 
under the region Si. Instead, we shall now consider Pi and P2 separately, 
and their differences shall become progressively apparent. 

Case I: (^1,^2) ^ Pi- We start with the calculation of the weights w{u^ v) for 
{u,v) G P1UP4. Define 

Ri = {{u,v) e Pi I w^{u,v) = 1}, 
i?2 = {{u,v) G P4 I w^{u,v) = 0}. 



Intuitively speaking, we expect most of the pixels in Pi to contribute to 
the estimate of the pixel at (^1,^2), since all their directional neighborhoods 
are very similar to the directional neighborhoods of (^1,^2)- This implies 
that Ri is "close" to Pi. Also, we expect that most of the pixels in P4 do 
not contribute in the estimate since all their directional neighborhoods are 
different from those of (ti,t2) ^ Pi- Lemma |6] provides a formal statement 
of this intuition. 

Lemma 6. Let Ss^Si^Us^rii^t be as defined in Lemma\^ and let q = 7ra~^/^ 
in the DANLM algorithm. Then, 

P(A(Pi) - \{Ri) >e) < O 

/ 22/3 

P(A(P4) - A(i?2) > e) < 

Proof. We use the technique we developed in Lemma [3] for the proof. Con- 
sider a point {u,v) G Pi and let 6* = arctan(/?.'(ti)) and represent an angle 
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from O that is closest to 9*. Then, 

P ( d\{dY{t,M),dY{u,v)) - ^^^^ < 

\ J 

> P (^dl^^^^^{dY{h,h),dYiu,v)) - < r.) 

> l-e-^ = l-0(a«), (27) 

where Inequahty (d) is a direct consequence of Lemma [2j Again as in the 
proof of Lemma [3] we conclude that E(A(Pi) — A(i?i)) = 0{a^). Employing 
this expression in Markov's Inequality leads us to 



P(A(Pi) - A(i?i) > e) < O 




We now prove the second part of this lemma. Consider a point (u^v) G P4. 
Then, 



P (d'^{dY{t,^h)^dY{u^v)) - < r^i 

= P ('min4^,^^,^(rfy(ti,t2),rfl^(^,^)) - ^^^^ < 

• 1 \ ^S^i J 

1=1 ^ ^ 

< 0(a22/3). 

Therefore, we conclude that E(A(P4) — A(i?2)) = 0((7^^/^). Combining this 
with Markov's Inequality establishes the second part of [6) □ 

Define the event as 

^ A ^x{P,) - A(i?i) < a^} n {A(P4) - A(i?2) < a^}. 
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The risk of the OANLM estimator at (ti,t2) ^ Pi is then given by 

\ wdudv J 

Define U = — t^t^I • We have 

E{U) = E{U I J')P(J') + E{U I J'^)P(J'^) < E{U I J')P(J') + P(J'^). (28) 

Lemma [g] confirms that P(J^^) = 0(cr-^^/^). Therefore, we should obtain an 
upper bound for the term E([/ | J^)P(J^). This step of the proof is the same 
as the proof of Case I in Theorem [2| and therefore we do not repeat it here. 

The more chaUenging step of the proof is to obtain an upper bound for 
the risk of DANLM for pixels (^1,^2) ^ i^2. 

Case II: (^1,^2) ^ ^2- In this case we again start with defining the following 
two sets: 

L^ ^ {Kt;)GPi K, = l}, 
L2 ^ {K^;)GP4K. = 0}. 

As before, our objective is to show that most of the pixels in Pi con- 
tribute in the final estimate of DANLM and most of the pixels in P4 do not 
contribute. The following lemma formalizes this intuition. 

Lemma 7. Let Ss^S^^ns^n^^ be as defined in Lemma\^ and let q = ttct"^/^. 
Then^ 

P(A(Pi) - A(Li) > e) < 

Proof. We first prove that for some ^ G O, Ie,Ss,Si{tiih) does not intersect 
the edge. Suppose that 9* is such that tan(0*) = h'{ti)^ and consider 6 = 
argmin5i^o |^ — ^*|. To ensure that the neighborhood ^^(^1,^2) does not 
intersect with the edge, we need to have 

t2 + td.n{9){t' -ti)-5s > h{ti) + h' {ti) {t' - ti) 

+ (29) 
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for all t' G + Also, t" G \ti^t']. According to the mean value 

theorem we have tan(0) = tan(0*) + i^tan2(6>0 ^^^ where = — and 
6^' G [— 7r/4, 7r/4], since in our Horizon model we have assumed \h'{ti)\ < 1. 
Furthermore, \h%f')\ < C2 and < a^/^ < ^S^. Therefore, wiU be 
satisfied if 

ttS 

t2 > h{ti) + ^5^ + 5s + C25l 

This constraint holds for all pixels in the region P2. Therefore, the neighbor- 
hood I§ ^^(^1,^2) lies completely in the region above the h. Therefore, 

P (d'AdYit,,t2),dY{u,v)) - < T^ 

= P (n^ndl^s.^sM^{t,,h),dY{u,v)) - '^JM^ < 

> F(dj^ ^{dYih,h),dY{u,v))-^^^'^^<T^ 

> 1-e ^ = l-0{a^). (30) 

where the Inequality (e) is a consequence of [2] By repeating the same 
argument as the argument in the proof of Lemma [3] we conclude that 
E(A(Pi) — A(Li)) = 0{a^). Employing this expression in Markov's Inequality 
leads us to 



P(A(Pi) - A(Li) > e) < O 



□ 

Lemma 8. Let 5^, 5^, rig, n^, t„ he as defined in Lemma\^ and let q = vra"^/^. 
Then, 

'10/3 

P(A(P4) - A(L2) > e) < O' 



Proof. We first prove that at least half of the area of l0^s^^s^{u^v) is located 
below the edge. For notational simplicity we assume that h'{ti) = 0. Con- 
sider Figure 14, in which the neighborhood has an arbitrary orientation. Let 
w,v be as defined in this figure. We then have that 



d 



cos cos 9 



Sg tan 9 



d — Cw'^ — SsSm9 d — Cw'^ — Ss 

> , -, (31) 



cos 6 



cos 9 
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where d = t2 — h{ti). Suppose d is such that, at the correct angle d — Cw'^ — 
6s > 0, i.e., d > {C + l)a^^^. According to (31), v > 0^ and hence more than 
half of the area of the neighborhood is in the white region (below the edge 
contour). Therefore, the number of pixels in this region is 41og^ a±o(log^ a). 
Consider a point (u^v) G P4. Then, 



P (d'^{dY{t,^t,)^dY{u^v)) - < r^l 



2nsn£a 



2 



= P mmdi,^,^^,^{dY{t,,t2),dY{u,v)) - < r, 

< (dls^,,idYit^,h),dYiu,v)) - < r.) 

• 1 \ ^s^£ J 

1=1 ^ ^ 

To derive Inequality (g) we have used the fact that ©(2(5^(5s) of the the pixels 
in each neighborhood are equal to 0. To consider the worst case we assumed 
the rest of the values are equal to 1. Using the same technique as in the 
proof of Lemma |6] in combination with (32) we obtain 

'^10/3 



P(A(P4) - A(L2) > e) < O 



□ 
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Based on Lemmas [7] and [s] we can calculate the risk of DANLM for (ti,t2) ^ 
P2. Define the event Q as 



g 4 {A(Pi) - A(Li) < a'} n {A(P4) - KL2) < a'}. 



The risk of the OANLM estimator at (^1,^2) ^ P2 is then given by 



E([/) = E([/ 1 e;)p(^) + E([/ 1 ^^)P(^^) < E([/ 1 e;)p(^) + p(6;^). (33) 

Lemmas [7] and [s] imply that ¥{Q^ = 0{a^^^). Furthermore, following along 
the same lines of case I, we can prove E([/ | ^)P(^) = 0((7^/^| log(a)|^/^|). 

Case IIL (ti,t2) ^ 5^2. The proof of this case is identical to that of Case II 
in the proof of Theorem [2} 

Case IV: (^1,^2) G U P4. The proof of this case is similar to that for the 
regions Pi and P2 in Cases I and II and therefore is skipped here. 

Finally, combining the upper bounds obtained in Cases I, II, III, and IV 
completes the proof of the theorem. 
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Appendix A. Minimax risk of the mean filter 

In this appendix, we obtain the decay rate of the risk of the mean filter 
for the Horizon class of images. Our proof is similar to the proof in [4J. 
Nevertheless, we repeat the proof here for the sake of completeness and since 




2 
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our continuous framework is slightly different from the discrete framework in 
[4]. See [TlJ for the generalization of this result. 

The classical mean filter estimates the image via 

/— r — 

where A specifies the size of the window on which averaging takes place. 
Theorem 5 ([4J). If is the estimate of the mean filter^ then 

inf sup R{fJ^^) = e{a'/'). 

Proof We first derive a lower bound for /^^). Consider a function 
fhitiih) with h{ti) = 1/2 (recall Figure [2]) and define 

Qa = {{tuh) I 1/2 - A < t2 < 1/2 + A}. 

It is straightforward to confirm that if (ti,t2) ^ then Ifhih^h) — 

Ef^^{ti,t2)\ > |. Therefore, if Bias(/^^) denotes the bias of the mean filter 
estimator, then we have 

Bias2(/^^) = [ \Mh,h)-Ef^''{t,,h)\'dt,dh 
Js 

> [ \fh{tl,t2)-Ef'^^{h,t2)\'dhdt2>^A. (A.l) 

Now consider a point (^1,^2) G S\Qa- We have 
E(/^^(ti,ti)-E/^^(ti,t2)r 

E{dW{ti - ri,t2 - T2)dW{ti - t[M - T^)) 



A A A A 

2 / 2 / 2 / 2 



A , 



A4 



A2 

Therefore, if Var(/'^^) is the variance of the mean filter estimator, then we 
have 



Var(/^^)> / E(/^^(ti,ti)-E/''^(ti,t2)r = ^(l-4A). (A.2) 

JS\Q4A ^ 
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I 1 1 1 2 1 2 

By combining (A.l) and (A. 2) we obtain the lower bound of + — 4^ 



A2 ' 32^ ""A 

for the risk of the mean filter estimator. If we minimize this lower bound 
over A then we obtain the lower bound of B(a^/^) for the risk. 

Now, we derive an upper bound for the risk. Define the region 
i?A = {(ti,t2) I h{ti)-A<t2<h{ti) + A}. 

It is straightforward to confirm that, if (^1,^2) ^ S\Ra^ then the A- 

neighborhood of this point does not intersect the edge contour. Therefore, 

2 

the bias of the estimator over this region is zero and the variance is If we 
bound the risk of the points in the Ra region by 1, we obtain the following 
upper bound for the risk. 



Again by optimizing the upper bound over A we obtain the upper bound of 
9(a^/^). This completes the proof. □ 
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Appendix B. Minimax risk of NLM 

Corollary [1] states the following upper bound for the risk of NLM: 

sup i?(/,/^) = 0(a|loga|). 

feH<-{C) 

In this section we prove that the risk of NLM is lower bounded by Q{a). 
The proof is similar to the proof of Theorem 5 in [41j; we consider the 
function //i(ti,t2) for ^(^i) = 1/2 and prove that the bias of the NLM on 
(ti^h) ^ Qi is ©(1) for any choice of the threshold parameter. However, 

n 

in the continuous setting considered here, the steps are more challenging. 
Following [41] we consider the semi-oracle NLM algorithm (SNLM). The 
semi-oracle 5-distance is defined as 

dj{dY{ti,t2),dY{si,S2)) 



where 



Xti.tzOl'is) = / . . f{Si,S2)dSidS2. 
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SNLM then estimates the weights according to 



. ^ . f 1 ii4{dY{t,M),dY{s,,S2))<^ + T,, 
\ otherwise. 



It is clear that the distance estimates of the SNLM algorithm are more accu- 
rate than those of NLM algorithm. Hence it outperforms NLM, and a lower 
bound that holds for SNLM will hold for NLM as weU. 

We make the following mild assumptions on the parameters of SNLM. 



Al: The window size 5 ^ as a ^ 0. 

A2: g = ^^(a^). Otherwise m]{dY{ti,t2),dY{si,S2)) ^ oc as a ^ 0. 

A3: n ^ oc as a ^ 0. This ensures that, if two points (^1,^2) and (^i, ^2) 
have the same neighborhoods, then Wt^^t2{^i^^2) = 1 with high proba- 
bility. 

A4: Ud{f{h,t2),f{si,S2)) > 1/4, then F{wl,^{s,, s^) = 1) = o{a'). 

Again, consider the function //j(ti,t2) for ^(^i) = 1/2 (recall Figure [2]). 
Let (^1,^2) ^ Qs/n- For notational simplicity we use w{si^S2) instead of 

Lemma 9. // \si — ti\ > S/2 and \s[ — ti\ > S/2, then 

F{wl,^{suS2) = 1) = F{wl,^{s[,S2) = 1) 
for any 1 1 , ^2 , , ^2 and s[ . 

The proof is straightforward and hence skipped here. Note that we we do 
not consider the boundary points whose neighborhoods are not subset of 
S 

Consider a point (^1,^2) ^ Gs/n- 

Lemma 10. If \s — ti\ > S/2, for u < 5/4:, then we have 

PK^^,,^(s,t2 -U) = l)= PK,,,(s,t2 + U) = l) 
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This lemma is a straightforward apphcation of symmetry, and hence we skip 
the proof. 

Lemma 11. Suppose that 5 and To- satisfy Al-A^- Then we have 
P / w{si^S2)dsids2 > cr^ = o{g). 

\Js\Qs/2 J 

Proof. Define 

B = {{si, S2) G S\Qs/2 I ^^(^1, 52) = 1} 

and the event 

g = {X{B) > a^}. 

We have 

E(A(S)) = e// I{{u,v) e B)dudv 

J Js\Qs/2 

= [ F{{u, v) G B)dudv = o(a^) (B.l) 



Equahty (i) is due to Assumption (A4); at least quarter of the pixel values 
in the isotropic neighborhood of any (u^v) G S\Qs/2 are different from the 
corresponding pixel values of the neighborhood around (^1,^2) G Qs/n- Using 
the Markov Inequality it is straightforward to show that 

(r \ E 1 w(u,v)dudv 
/ w{u, v)dudv >aA< ^'^"^'^^ : 



(72 



o{a), (B.2) 



where the last equality is the result of (B.l). Finally, if X{B) < a , then 



w{u^v)dudv = O(cr^). 

S\Qs/2 

This completes the proof. □ 
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Our next step toward the proof of a lower bound for the risk of NLM 
algorithm is to show that the bias of the pixels that are in Q^/n is B(l). To 
achieve this goal we show that if (^1,^2) Qbjn is above the edge and (51,52) 
is below the edge then there is a non-vanishing chance (not converging to 
zero as a ^ 0) that (51,52) may contribute to the estimate of (^1,^2)- The 
following theorem is a formal presentation of this claim. 

Proposition 1. Let (ti, ^2), (5i, 52) G Q^jn- Then there exists po > inde- 
pendent of a such that for every tn^S we have 

Proof We need to demonstrate that regions above and below the edge con- 
tour are included in the NLM algorithm with a constant non-zero probability 
Pq. First, we use the definition of the weights of the NLM algorithm to de- 
termine the probability that w{u^v) = 1 

P«^,^(5i,52) = 1)=F U{dY{t,,t2),dY{s,,S2) < ^ + 

where s^^p = z^'^'^^{£^p). Using the Berry-Esseen Central Limit Theorem for 
independent non-identically distributed random variables [5lj , we can easily 
bound this probability away from zero. For more details, see Proposition 1 
in IH] . Note that according to Assumption A2 we have ^ = 0(1). □ 

We now consider the weights for the regions above and below the edge 
contour separately. Therefore, we define 

Ql = {{ti,t2) I l/2<t2<l/2 + A/4}, 
Ql = {iti,t2) I l/2-A/4<t2<l/2}, 
^l,s = {(^1,^2) I 1/2 <t2< 1/2 + A/4, < ti < 2,5}, 



^Is = {(ti,t2) I 1/2- A/4<t2 < 1/2, 0<ti <25}. 



Let 
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Note that according to Lemma |9] this probabihty does not depend on v and 
hence we have used the notation p^^a- 

Lemma 12. Ifw{u^v) denotes the weights used in the NLM algorithm^ then 
we have 



P 



P 



w{u^ v)dudv 
/ w{u^v)dudv 



Pu,adu 



log(a) 



AS 



log(a) 



AS 



o 



o 



a 



a 



Proof. The weights w(u, v) over the pixelated neighborhoods can be rewritten 
in the foUowing way 

/ w{u^v)dudv = / w{u + 2k6^ v)dudv. 



Applying the Hoeffding inequahty, it is straightforward to confirm that 



P 



2(5 



w{u + 2k5^ v) — 



Pv 



k=l 



25 



>t \ < 2e 



(B.3) 



Now, defining 
we see that 

p(A(r2X) - A(ri) > e) 



w{u + 2M, v) 



Pu,, 



k=l 



26 



< t 



= p ( / (1 - g rX)rfttrfw > e 
{!{u,v)m\ (1 - p((«, f ) G ri)d«dt- 



(m) 2e-2'5*' 

< 
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where (ii) is due to Markov Inequality and Inequality (iii) is due to (B.3). 
Furthermore, 



/ w{u + 2kS^ v)dudv — [ dudv 

A k=l A 

w{u + 2k5,v)-'^^^ 



< 



(iv) 



k=l 



dudv 



< tX{Tl) < tX{nl) < tA5, (B.4) 
where Inequality (iv) is due to the definition of r\. Define the event 

J^^{X{Ql)-X{Tl)<e}. 
Given holds, we conclude X{^\) — A(r^) < e and hence 



< 



< 



w{u^v)dudv — I p{u^(j)dudv 



dudv 



w{u + 2/c5, v)dudv 

A k=l 

J_ 
2(5 

w{u + 2k5^ v)dudv — / w{u + 2fc(^, v)dudv 



+ 



+ 



^A /c=l A k = l 

p(2i,cr 



A 



y^ + 2A;5, v)dudv 



^ A 



-dudv 



dudv — 



dudv 



(v) 6 

< 2-+tA(5, 





where Inequality (v) is due to (B.4) and the fact that we have assumed 
holds. Setting t = \/ completes the proof. □ 
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Theorem 6. Suppose that S, t^- and n satisfy Assumptions AI-A4. Then 
the risk of SNLM is 

inf sup R{fJ') = n{a). 



Proof. Let (^1,^2) ^ ^l/n' Consider the following two definitions from the 



proofs of Lemma [TT] and Lemma 12 



Q = {A(5) > a'}, 

where 

B = {{si, S2) G S\Qs/2 I ^(^1, 52) = 1}, 

and 

Note that according to Lemmas [TT] and [12] 



P(jr'^ug^) = o(a). 



We will calculate a lower bound for the risk of NLM. We have 

E (f(t^ t2) - /5^(^'^)-^(^'^)^^^^ y > i^i 

\ ' jgw{u,v)dudv J ~ \ V jgw{u,v)dudv J 
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which leads us to towards the lower bound 



E 



' J^w{u^v)X{u^v)dudv\^ >Ei /s' ^)"^(^' ^)^^^^ 

^Q(5/2 ^^^^ v)dudv 



> 



(vi) 

> 



(vii) 

< 



(viii) 
< 



J^w{u^v)dudv J 

w{u^ v)X{u^ v)dudv 



E 



E 



> E 



^Q8/2 ^(^^ v)dudv 

w{u^ v)X{u^ v)dudv 
4^^p„,.d«cii;-2f -I log(a) 1(53/2 

w{u^ v)X{u^ v)dudv 
fg w{u, v)f{u, v)dudv 



Li w(u,v)dudv 



a' 



log(cT)|(53/2 




jQ^^^Pu,adudv-2^ -t5^ 
7qi Pu,adudv + 2^ + \\og{a)\S^/^ + a 



■fQs/2 Pn,adudv 



2f - tS^ 



> E 



^Qi Vu,adudv + 2^ + 1 log((j)|53/2 ^ ^2 



(5/2 



a + 2 p„,,ci«cit; - 2^ - | log(a) 1^3/2 

^(5/2 " 



Inequality (vi) is an immediate consequence of Lemma 12, Inequality (vii) 



is due to Lemma 11, Finally Inequality (viii) is the result of Lemma [12} 
The minimum of the last line is achieved when Li Pu^jdudv is minimized. 

^^(5/4 

However, we have proved in Proposition jl that p{u) > po for (u^v) G Ql/^^- 
Therefore, J^i Pu^adudv is lower bounded by ©(^) which is equal to Q{a) 

according to Assumption A2. When we substitute this optimum value in the 
lower bound (B.5), we see that the risk over this region is B(l). Therefore 
the bias of the NLM over the entire image is ^{^) or, equivalently, Q{a) 
according to Assumption A2. □ 
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t2 




Figure C.15: A sample function from J^z. In this figure, Q 
1, 2, . . . , m — 1. 



1 for i 



Appendix C. Proof of Theorem [T] 

Here we focus on the case of a = 2 and use the standard technique of 
hypercube construction to estabhsh the lower bound [10]. Let (j) : [0, 1] 
U {0} be a two-times diflFerentiable function with ||^||oo = 1, 0(0) = 

^(l) = 0, Slo- .,11 
define 



#L =0, and g 



d^cf> 



0. For i = 0, 1, . . . , m — 1 



''"^ 1 otherwise 



Consider the function /o : [0,1]^ [0,1], fo — l{t2<o.5}, and define 
V^i,m = l{t2<(/>^,m(tl)+o.5} - fo- Finally, define 



J^m = {f0 + Y^C^i^^Atl^h), ^ {O, l}}- 



i=l 



A sample function of this class is displayed in Figure |C.15[ Since J^z C 
H'^{C), we have 



inf sup E||/-/||^>infsupE||/-/||^. 



(C.l) 



53 



The right hand side of (C.l) can be calculated more easily, since we can 
restrict our attention to the estimators of the form /o + YlT=i d^i^m- This is 
due to the fact that if Pjr^ is the projection onto Jvz, then for every / G J^rn 
we have 

\\P^M) -f\\l< \\PtM) - P^M)\\l < 11/ - fWl 

Furthermore for f = fo + Ya=i CiV'i,m(^i, ^2) and f = fo + EI^i 6^i,m(^i, ^2) 
we have 

ll/-/ll^ = «:™||C-CII^, (C.2) 

where Km = ||'0i,m(^ii ^2)!^ and satisfies 

>im=Ui,m{tl,t2W= [ [ i^lmitl,t2) = [ / ^2) = e(m-^). 

Jti Jt2 Jti Jt2 

According to (C.2) the original problem reduces to one of estimating (. 
Therefore, we reduce the problem to the problem of estimating . . . Xm 
from the observations yf^, . . . , given by 



We have 



where wf A^(0, a^/^^). Consider the problem of estimating d from the 
observation d + A^(0,a^//^^) and set m to the smallest integer for which 
(^'^/i^m ^ 1- For this choice of m the risk of any estimator is lower bounded 
by a constant B. Combining this with (C.2) we conclude that minimax risk 
defined in (C.l) is lower bounded by BrnKm- It is straightforward to confirm 
that m = a~^^ satisfies the condition cr'^/i^rn ^ 1- Therefore we finish the 
proof by setting m = cr~^/^ and k = a'^. 
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